As can be seen, these prediction programs are quite easy to use and provide a relatively
quick first insight into possible TFBSs, for example in unknown sequences, but they usually
predict a large number of binding sites. In this context, it is important to know the exact
parameters of the individual programs in order to obtain meaningful results for further
experimental investigation. If I am careless and choose, for example, too high a
“dissimilarity rate”, I may get hits that have no biological meaning at all. Consequently, for
further investigations, the position with the lowest dissimilarity rate should always be
selected for the desired TFBS, i.e. the one that best matches the search template (here for
NF-AT2, e.g. position 632–640 with a dissimilarity rate of <5%). In any case, bioinformatically
predicted TFBSs must be validated experimentally. Only then can I be sure that the
transcription factor found actually has an effect on gene expression; otherwise, only the
DNA nucleotides match the prediction (which is why I got a hit), without any biological
relevance.
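The selection rule described above (discard hits above a dissimilarity cutoff, then keep the hit with the lowest dissimilarity rate) can be sketched in a few lines of Python. This is only an illustration and not the output format of any particular prediction program. The `Hit` class, the `best_hit` function, the 15% cutoff and all numbers in the example list are assumptions. Only the NF-AT2 position 632–640 comes from the text, and its value of 4.2% is a placeholder standing in for "<5%".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hit:
    """One predicted binding site (hypothetical record layout)."""
    factor: str           # transcription factor name
    start: int            # first position of the site in the sequence
    end: int              # last position of the site in the sequence
    dissimilarity: float  # dissimilarity rate in percent (lower = better match)


def best_hit(hits: List[Hit], factor: str,
             max_dissimilarity: float = 15.0) -> Optional[Hit]:
    """Return the hit for `factor` with the lowest dissimilarity rate.

    Hits above `max_dissimilarity` are discarded first; the default
    cutoff of 15% is an arbitrary assumption for this sketch.
    """
    candidates = [h for h in hits
                  if h.factor == factor and h.dissimilarity <= max_dissimilarity]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h.dissimilarity)


# Invented example data; only position 632-640 for NF-AT2 is taken from the text.
hits = [
    Hit("NF-AT2", 120, 128, 12.5),
    Hit("NF-AT2", 632, 640, 4.2),
    Hit("NF-AT2", 910, 918, 21.0),
    Hit("SP1", 300, 309, 3.0),
]

best = best_hit(hits, "NF-AT2")
print(best)
```

Even the best-scoring hit selected this way remains a prediction and is only a candidate for experimental validation.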
Finally, another option is to annotate the genome sequence: I examine it with BLAST and
thereby directly identify the proteins it encodes. For example, Psi-BLAST allows me
19 Tutorial: An Overview of Important Databases and Programs